Catch trial performance

## [1] "Excluded 3 participants based on catch-trial performance."

Exclusion of random guesses

We further exclude participants who seem to provide random ratings that are independent of the scene they are seeing. We quantify this by computing, for each participant, the mean rating for each utterance across all trials and then the correlation between the participant’s actual ratings and these mean ratings. A high correlation is unexpected, because ratings should vary with the scene; it indicates that the participant’s ratings did not depend on the scene. We therefore also exclude the data from participants for whom this correlation is larger than 0.75.

## [1] "Excluded 0 participants based on random responses."
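
The exclusion criterion described above could be computed roughly as follows. This is a minimal sketch in base R on made-up data: the data frame `ratings`, its column names, and the utterance labels are hypothetical, and the original script may group the data differently (for example, by additional variables).

```r
# Made-up ratings for two participants, two utterances, five scenes each.
ratings <- data.frame(
  workerid  = rep(c(1, 2), each = 10),
  utterance = rep(rep(c("might", "probably"), each = 5), times = 2),
  rating    = c(0.10, 0.60, 0.70, 0.30, 0.00,   # worker 1, "might": varies with scene
                0.00, 0.10, 0.30, 0.70, 0.90,   # worker 1, "probably": varies with scene
                0.20, 0.22, 0.18, 0.21, 0.19,   # worker 2, "might": nearly constant
                0.80, 0.79, 0.82, 0.81, 0.78)   # worker 2, "probably": nearly constant
)

# Mean rating per participant and utterance, aligned with the original rows.
ratings$mean_rating <- ave(ratings$rating, ratings$workerid, ratings$utterance)

# One correlation per participant between actual ratings and mean ratings.
cors <- sapply(split(ratings, ratings$workerid),
               function(d) cor(d$rating, d$mean_rating))

# Participants above the 0.75 threshold would be excluded.
excluded <- names(cors)[cors > 0.75]
print(sprintf("Excluded %d participants based on random responses.", length(excluded)))
```

In this toy example only worker 2, whose ratings depend on the utterance but not on the scene, exceeds the threshold.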

Aggregated results

Comparison across conditions

Individual responses

AUC computation

We compute the area under the curve (AUC) directly, using the AUC function with the spline method.
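
The call itself is not shown in this report. The AUC function is presumably `DescTools::AUC` with `method = "spline"`, but that is an assumption. The sketch below illustrates the general idea in base R only: interpolate the points with a spline and integrate the interpolant. The data values are made up for illustration.

```r
# Made-up ratings for one utterance at several percentages of blue.
percentage_blue <- c(0, 10, 25, 40, 50, 60, 75, 90, 100)
rating <- c(0.02, 0.05, 0.20, 0.45, 0.60, 0.55, 0.30, 0.10, 0.03)

# Interpolate the points with a natural cubic spline ...
f <- splinefun(percentage_blue, rating, method = "natural")

# ... and integrate the interpolant over the observed range.
auc_spline <- integrate(f, lower = min(percentage_blue),
                        upper = max(percentage_blue))$value

# Trapezoid rule on the same points, for comparison.
auc_trapezoid <- sum(diff(percentage_blue) *
                     (head(rating, -1) + tail(rating, -1)) / 2)

c(spline = auc_spline, trapezoid = auc_trapezoid)
```

The two estimates should be close; the spline version smooths over the corners that the trapezoid rule introduces between neighbouring points.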

t-test and regression model with control variables:

## 
##  Two Sample t-test
## 
## data:  aucs.cautious$auc_diff and aucs.confident$auc_diff
## t = 2.8557, df = 120, p-value = 0.005062
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   3.36670 18.58957
## sample estimates:
## mean of x mean of y 
## 17.239544  6.261407
## 
## Call:
## lm(formula = auc_diff ~ cond + test_order + first_speaker_type + 
##     confident_speaker, data = rbind(aucs.cautious, aucs.confident))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -44.687 -13.660  -0.413  11.727  61.374 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)   
## (Intercept)                        4.050      3.979   1.018  0.31094   
## condconfident (probably-biased)  -10.978      3.601  -3.048  0.00284 **
## test_orderreverse                  9.655      3.610   2.674  0.00856 **
## first_speaker_typeconfident       10.918      3.606   3.027  0.00303 **
## confident_speakerconfidentm        5.421      3.610   1.502  0.13586   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.89 on 117 degrees of freedom
## Multiple R-squared:  0.1989, Adjusted R-squared:  0.1715 
## F-statistic:  7.26 on 4 and 117 DF,  p-value: 2.941e-05
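
As a consistency check, the confidence interval and p-value of the t-test can be recovered from the printed means, t statistic, and degrees of freedom. The sketch below uses only numbers copied from the output above; the variable names are illustrative. The same difference in means (about 10.98) reappears with the opposite sign as the `condconfident` coefficient in the regression.

```r
# Values copied from the t-test output above.
mean_x <- 17.239544   # mean of aucs.cautious$auc_diff
mean_y <- 6.261407    # mean of aucs.confident$auc_diff
t_stat <- 2.8557
df     <- 120

mean_diff <- mean_x - mean_y                  # difference in means
se        <- mean_diff / t_stat               # standard error implied by t
ci        <- mean_diff + c(-1, 1) * qt(0.975, df) * se
p_value   <- 2 * pt(-abs(t_stat), df)         # two-sided p-value

round(mean_diff, 3)
round(ci, 2)
signif(p_value, 3)
```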

Clustering analyses

library(mclust)
## Package 'mclust' version 5.4.10
aucs_diff = merge(aucs.cautious, aucs.confident, by=c("workerid"))
aucs_diff$diff_of_diffs = aucs_diff$auc_diff.x - aucs_diff$auc_diff.y

aucs_diff %>% 
  ggplot(aes(x=diff_of_diffs)) + 
    geom_density() + 
    geom_jitter(aes(y=0), width=0, height=0.001) + 
    ggtitle("Raw data + estimated density")

Gaussian mixture models of differences of AUC differences

1 Cluster

fit1 = Mclust(aucs_diff$diff_of_diffs, G=1)
print(summary(fit1, parameters=TRUE))
## ---------------------------------------------------- 
## Gaussian finite mixture model fitted by EM algorithm 
## ---------------------------------------------------- 
## 
## Mclust X (univariate normal) model with 1 component: 
## 
##  log-likelihood  n df       BIC       ICL
##       -281.9502 61  2 -572.1221 -572.1221
## 
## Clustering table:
##  1 
## 61 
## 
## Mixing probabilities:
## 1 
## 1 
## 
## Means:
## [1] 10.97814
## 
## Variances:
## [1] 605.704

2 Clusters

fit2 = Mclust(aucs_diff$diff_of_diffs, G=2)
print(summary(fit2, parameters=TRUE))
## ---------------------------------------------------- 
## Gaussian finite mixture model fitted by EM algorithm 
## ---------------------------------------------------- 
## 
## Mclust E (univariate, equal variance) model with 2 components: 
## 
##  log-likelihood  n df       BIC       ICL
##       -275.9833 61  4 -568.4101 -576.1991
## 
## Clustering table:
##  1  2 
## 51 10 
## 
## Mixing probabilities:
##         1         2 
## 0.8205388 0.1794612 
## 
## Means:
##         1         2 
##  2.048788 51.805221 
## 
## Variances:
##        1        2 
## 241.1448 241.1448

3 Clusters

fit3 = Mclust(aucs_diff$diff_of_diffs, G=3)
print(summary(fit3, parameters=TRUE))
## ---------------------------------------------------- 
## Gaussian finite mixture model fitted by EM algorithm 
## ---------------------------------------------------- 
## 
## Mclust E (univariate, equal variance) model with 3 components: 
## 
##  log-likelihood  n df       BIC       ICL
##        -276.006 61  6 -576.6773 -631.9808
## 
## Clustering table:
##  1  2  3 
##  8 43 10 
## 
## Mixing probabilities:
##         1         2         3 
## 0.3230834 0.5026031 0.1743135 
## 
## Means:
##         1         2         3 
## -2.083159  4.999926 52.423891 
## 
## Variances:
##        1        2        3 
## 233.1972 233.1972 233.1972

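According to the Bayesian information criterion (BIC), a model with two clusters describes the data best: under mclust's sign convention larger values are better, and the two-cluster model has the highest BIC (-568.41, compared with -572.12 for one cluster and -576.68 for three).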
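
The BIC values can be recomputed from the log-likelihoods and degrees of freedom printed in the three summaries. The sketch below uses only those printed numbers; the vector names are illustrative.

```r
# Log-likelihoods and degrees of freedom copied from the summaries above.
n      <- 61
loglik <- c(G1 = -281.9502, G2 = -275.9833, G3 = -276.006)
df     <- c(G1 = 2, G2 = 4, G3 = 6)

# mclust's sign convention: BIC = 2 * logLik - df * log(n), larger is better.
bic <- 2 * loglik - df * log(n)

round(bic, 2)
names(which.max(bic))   # model with the highest BIC
```

Moving from two to three clusters leaves the log-likelihood essentially unchanged while adding two parameters, which is why the three-cluster model is penalized most.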

Fitted model:

# Model "E" has a single shared variance, so sigmasq holds one value.
# Indexing sigmasq[2] yields NA, which would drop the second curve from the plot.
shared_sd = sqrt(fit2$parameters$variance$sigmasq[1])
aucs_diff %>% 
  ggplot(aes(x=diff_of_diffs)) + 
    geom_jitter(aes(y=0, color=first_speaker_type.x), width=0, height=0.001)  +
    ggtitle("Raw data + Components of Gaussian mixture") + 
    stat_function(fun = dnorm, args = list(mean = fit2$parameters$mean[1], sd = shared_sd)) + 
    stat_function(fun = dnorm, args = list(mean = fit2$parameters$mean[2], sd = shared_sd))
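
To illustrate how the two-component model assigns a participant to a cluster, the sketch below computes the posterior probability of the second cluster from the parameters printed in the two-cluster summary. mclust computes these probabilities itself (they are stored in `fit2$z`); the function and variable names here are illustrative only.

```r
# Parameters copied from the two-cluster summary above.
pro   <- c(0.8205388, 0.1794612)   # mixing probabilities
mu    <- c(2.048788, 51.805221)    # component means
sigma <- sqrt(241.1448)            # shared standard deviation (model "E")

# Posterior probability of cluster 2 for a value x of diff_of_diffs.
posterior_cluster2 <- function(x) {
  w1 <- pro[1] * dnorm(x, mean = mu[1], sd = sigma)
  w2 <- pro[2] * dnorm(x, mean = mu[2], sd = sigma)
  w2 / (w1 + w2)
}

# With equal variances, the value at which both clusters are equally likely
# has a closed form.
boundary <- mean(mu) + sigma^2 * log(pro[1] / pro[2]) / (mu[2] - mu[1])

posterior_cluster2(c(0, boundary, 60))
```

Because the first cluster has the larger mixing probability, the boundary lies closer to the second cluster's mean than the midpoint between the two means.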

Compute likelihoods based on the adaptation model

## # A tibble: 244 × 5
##    workerid condition most_likely_model name                 value
##       <int> <chr>     <chr>             <chr>                <dbl>
##  1     1436 cautious  confident         likelihood.cautious  -839.
##  2     1436 cautious  confident         likelihood.confident -800.
##  3     1436 confident cautious          likelihood.cautious  -945.
##  4     1436 confident cautious          likelihood.confident -970.
##  5     1437 cautious  cautious          likelihood.cautious  -413.
##  6     1437 cautious  cautious          likelihood.confident -537.
##  7     1437 confident cautious          likelihood.cautious  -307.
##  8     1437 confident cautious          likelihood.confident -496.
##  9     1438 cautious  cautious          likelihood.cautious  -589.
## 10     1438 cautious  cautious          likelihood.confident -641.
## # … with 234 more rows

List of adapters:

workerid  first_speaker_type  test_order  noticed_manipulation  cautious_count  confident_count  aligned_count  first_adaptation_speaker_count
1438      cautious            parallel    1                     1               1                2              1
1439      confident           parallel    0                     1               1                2              1
1443      cautious            reverse     1                     1               1                2              1
1447      cautious            parallel    0                     1               1                2              1
1453      confident           parallel    0                     1               1                2              1
1461      confident           reverse     0                     1               1                2              1
1466      cautious            reverse     1                     1               1                2              1
1467      cautious            reverse     0                     1               1                2              1
1472      cautious            reverse     0                     1               1                2              1
1477      cautious            parallel    1                     1               1                2              1
1480      cautious            parallel    1                     1               1                2              1
1487      confident           parallel    1                     1               1                2              1
1490      cautious            reverse     0                     1               1                2              1
1491      cautious            reverse     1                     1               1                2              1
1496      cautious            reverse     1                     1               1                2              1
1497      confident           reverse     1                     1               1                2              1
1501      cautious            reverse     1                     1               1                2              1

List of reverse adapters:

workerid  first_speaker_type  test_order  noticed_manipulation  cautious_count  confident_count  aligned_count  first_adaptation_speaker_count
1436      cautious            parallel    1                     1               1                0              1
1440      confident           parallel    0                     1               1                0              1
1459      cautious            reverse     0                     1               1                0              1
1470      confident           reverse     0                     1               1                0              1
1471      confident           reverse     0                     1               1                0              1
1473      confident           parallel    0                     1               1                0              1
1474      confident           parallel    0                     1               1                0              1
1485      confident           parallel    0                     1               1                0              1
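
The likelihood table and the two lists above suggest the following classification logic, sketched here in base R. Two things are assumptions rather than facts from the report: that `value` is a log-likelihood (so the larger value marks the better-fitting model), and that participants with an `aligned_count` of 2 are the adapters and those with 0 the reverse adapters. The input rows are the (rounded) values printed for workers 1436 and 1437.

```r
# Rows copied from the likelihood table above (workers 1436 and 1437).
lik <- data.frame(
  workerid  = rep(c(1436, 1437), each = 4),
  condition = rep(rep(c("cautious", "confident"), each = 2), times = 2),
  name      = rep(c("likelihood.cautious", "likelihood.confident"), times = 4),
  value     = c(-839, -800, -945, -970, -413, -537, -307, -496)
)

# For each worker and condition, pick the model with the larger value.
groups <- split(lik, list(lik$workerid, lik$condition), drop = TRUE)
best <- do.call(rbind, lapply(groups, function(d) {
  data.frame(
    workerid = d$workerid[1],
    condition = d$condition[1],
    most_likely_model = sub("likelihood.", "", d$name[which.max(d$value)],
                            fixed = TRUE)
  )
}))

# Count the conditions in which the best-fitting model matches the condition.
best$aligned <- best$most_likely_model == best$condition
aligned_count <- sapply(split(best$aligned, best$workerid), sum)

# Assumed labelling rule: 2 aligned conditions = adapter, 0 = reverse adapter.
label <- ifelse(aligned_count == 2, "adapter",
         ifelse(aligned_count == 0, "reverse adapter", "neither"))
label
```

Under these assumptions worker 1436 is labelled a reverse adapter, which matches the list above, and worker 1437 falls into neither group, which is consistent with that worker appearing in neither list.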